Establishment of a Dataset for the Traditional Korean Medicine Examination in Healthy Adults

We established a protocol for the traditional Korean medicine examination (KME) and methodically gathered data following this protocol. Potential indicators for KME were extracted through a literature review; the first KME protocol was developed based on three rounds of expert opinions. The first KME protocol’s feasibility was confirmed, and data were collected over four years from traditional Korean medicine (KM) hospitals, focusing on healthy adults, using the final KME protocol. A literature review identified 175 potential core indicators, condensed into 73 indicators after three rounds of expert consultation. The first KME protocol, which was categorized under questionnaires and medical examinations, was developed after the third round of expert opinions. A pilot study using the first KME protocol was conducted to ensure its validity, leading to modifications resulting in the development of the final KME protocol. Over four years, data were collected from six KM hospitals, focusing on healthy adults; we obtained a dataset comprising 11,036 healthy adults. This is the first protocol incorporating core indicators of KME in a quantitative form and systematically collecting data. Our protocol holds potential merit in evaluating predisposition to diseases or predicting diseases.


Introduction
Qualitatively recorded data based on human sensory perception can be transformed into quantitative metrics through digital technology [1]. Digital healthcare technology, exemplified by smart devices, employs high-performance sensors and computational processing units [2]. These technologies are proxies for the human senses and memory, effectively replacing subjective observations with objective and quantifiable data [3].
Within the framework of a single official national health system (NHS), Korea administers a bifurcated system that separately offers conventional medicine and traditional Korean medicine (KM) services. Specifically, Korean medicine doctors (KMDs), under governmental regulations, deliver healthcare services such as acupuncture, moxibustion, cupping, manipulation, and herbal medicine. However, these services constitute a minor fraction within the Korean NHS [4].
Modern medicine, based on quantifiable data, is a prominent advancement in medical technology, mitigating human diseases and extending healthy lifespans [5,6]. In the field of KM, technology has evolved continuously through the application of cutting-edge techniques. Moreover, the use of big data has become increasingly prominent in healthcare and KM. In particular, the Sasang constitution is a crucial resource in KM that emphasizes the preventive aspects of health related to lifestyle, nutrition, and behavior [7]. Analyzing Sasang constitutional data enables us to understand a patient's constitutional characteristics and develop personalized KM treatment methods [8,9]. KM emphasizes individualized treatment based on constitutional characteristics, and leveraging such data can greatly assist in finding optimal treatment methods tailored to each patient's unique constitution [10]. Recently, a multicenter Sasang constitution data bank for Koreans has been built up continuously with quantitative data [11]. However, limitations still exist in the field of KM owing to the scarcity of structured and quantitative data.
In this study, we developed a traditional Korean medicine examination (KME) protocol by standardizing and quantifying the measurement of the core KM indicators commonly used to evaluate patients' health status in KM clinical practice, and we systematically collected data following this protocol. The KME protocol holds potential merit in assessing disease risk and predicting diseases. In addition, the KME dataset is intended to explore the potential of big data in KM.

Materials and Methods
The workflow comprised the following specific steps (Figure 1).

Preliminary Review
A comprehensive list of potential core indicators for KME was compiled by three KMDs through a literature review encompassing several sources. The review covered published KM textbooks; the current status of KM questionnaire and device development, including KM diagnostic questionnaires and devices for quantitative measurement such as those for tongue and pulse diagnosis, searched via the Oriental Medicine Advanced Searching Integrated System (OASIS) database (https://oasis.kiom.re.kr/index.jsp (accessed on 18 May 2018)); clinical research protocols for projects conducted by the Korea Institute of Oriental Medicine (KIOM); and the status of domestically available commercial medical devices.

Expert Opinions
The panel participating in the two rounds of the survey comprised 15 KMDs recommended by the Society of Korean Medicine and more than ten diverse specialty societies within the Association of Korean Medicine. The first survey assessed whether the 9 primary categories and 39 subcategories identified in the preliminary review, along with the 175 indicators contained within them, were suitable candidates for KME core indicators.
To assess the suitability of the categorized core indicators for KME, a second survey was conducted among the same experts as in the first round. Each indicator was evaluated on a 3-point Likert scale: 1 = unsuitable, 2 = suitable, and 3 = very suitable. An indicator was selected if its average expert score was 2.5 or higher. The core indicators were divided into questionnaire categories, such as thirst and xerostomia, and measurement categories, such as halitosis. To assist experts in their evaluation, validated questionnaires, standardized assessment protocols, proxy indicators, and exemplar units of measurement were proposed for each category. Additionally, indicators with an average score of less than 2.5 were reconsidered through internal discussions among the researchers if they met any of three conditions: appropriate measures were already available in the clinical practice of KM, such as body shape/posture, halitosis, and electroacupuncture according to Voll (EAV); the proportion of respondents giving a score of 2 or higher was greater than 80%; or they were deemed meaningful when collected together, such as type (2.4), frequency (2.5), and duration (2.5) of exercise. Finally, the core indicators chosen according to these predetermined criteria were incorporated into the KME indicators.
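The second-round scoring rule can be illustrated with a minimal sketch. It assumes each indicator's ratings are available as a list of integers from 1 to 3; the function name, the return labels, and the example ratings are hypothetical and are not taken from the study. Only the numerical conditions (mean score of 2.5 or higher, and more than 80% of respondents scoring 2 or higher) are encoded; the other two reconsideration conditions relied on the researchers' judgment and are not modeled here.

```python
from statistics import mean

MEAN_THRESHOLD = 2.5       # selection criterion stated in the text
AGREEMENT_THRESHOLD = 0.8  # "greater than 80%" of respondents scoring 2 or higher


def classify_indicator(scores):
    """Classify one indicator from its expert ratings (hypothetical helper).

    scores: list of ratings on the 3-point scale (1, 2, or 3).
    Returns "selected", "reconsider", or "not_selected".
    """
    if not scores or any(s not in (1, 2, 3) for s in scores):
        raise ValueError("scores must be a non-empty list of 1, 2, or 3")
    if mean(scores) >= MEAN_THRESHOLD:
        return "selected"
    # Below the mean threshold: flag for internal discussion only if the
    # share of experts rating the indicator 2 or higher exceeds 80%.
    share_suitable = sum(s >= 2 for s in scores) / len(scores)
    if share_suitable > AGREEMENT_THRESHOLD:
        return "reconsider"
    return "not_selected"


# Made-up ratings from a 15-member panel, for illustration only.
print(classify_indicator([3] * 8 + [2] * 7))
print(classify_indicator([2] * 13 + [1] * 2))
print(classify_indicator([2] * 10 + [1] * 5))
```

An indicator labeled "reconsider" here would still need the researchers' discussion before inclusion, as described above.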
In the third round, the final selection of the questionnaire and measurement methods for the first KME protocol development was based on the indicators identified through the secondary expert survey. The indicators were divided into questionnaire and medical examination categories. The task involved verifying whether the indicators could be assessed in a real-world clinical setting. Additionally, it encompassed the process of choosing a method or device capable of quantitatively evaluating these indicators and conducting preliminary measurements. At this stage, four biomedical engineers and six KMDs were consulted. The six KMDs included three specialists in KM diagnosis, one specialist in KM neuropsychiatry, one specialist in KM internal medicine, and one specialist in KM gynecology. For areas requiring expert opinion, such as color quantification through photographs or quantification of dietary intake and eating habits, additional consultations were conducted with a professional photographer and a professor of nutrition science.

Pilot Study Using the First KME Protocol
The first KME protocol was developed based on the core indicators of KME. To evaluate its feasibility, a pilot study was conducted by seven KMD experts using the developed protocol. Through the pilot study, several points were improved, and the final protocol was confirmed.

Final KME Protocol Development and Update
The final KME protocol was implemented in 2020 and updated until 2023. These updates aimed to prevent response errors and enhance measurement accuracy during the data collection and cleaning processes.

Establishment of Electronic Case Report Form System
The mobile application software used for completing the questionnaires and the electronic case report form (eCRF) were developed in compliance with the relevant guidelines provided by the Ministry of Food and Drug Safety [12]. The questionnaire completion software is compatible with both Android and iPhone operating systems, whereas the eCRF has been optimized for the Chrome web browser, enabling efficient recording and management of all data collected during clinical trials. Clinical research participants completed all questionnaire information directly through a mobile application. All participants were required to thoroughly complete the entire questionnaire through a dedicated mobile application before proceeding to the medical examination phase of the KME program.
The app automatically processed the data securely before transferring them to the server. At each hospital, designated researchers entered medical examination data using the eCRF system. In turn, clinical research associates ensured data accuracy through ongoing monitoring. The data obtained from eCRF were processed using R version 4.2.1 (R Foundation for Statistical Computing, Vienna, Austria).

Dataset Collection
This study collected detailed observational data. The KME population comprised non-institutionalized, healthy adults residing in Korea. This study was conducted at the following six sites: Naju Dongshin University Korean Medicine Hospital, Pusan National University Korean Medicine Hospital, Gachon University Gil Medical Center, Dongguk University Ilsan Oriental Hospital, Dunsan Korean Medicine Hospital of Daejeon University, and Semyung University Korean Medicine Hospital. The study was conducted in each hospital with the same KME protocol and standard operating procedures (SOPs) every year. All hospitals were furnished with the same medical measuring devices, which were tested and validated. To prepare for the clinical study, we conducted three rounds of training following a standardized SOP. Furthermore, to ensure protocol compliance, we performed regular monitoring visits. The device data uploaded to the eCRF were monitored in-house in real time. We conducted an audit to guarantee the reliability of the data. All participants, regardless of sex, were voluntarily recruited on a first-come, first-served basis from those who completed the national general health screening. All participants were adults aged ≥19 years with no cognitive impairment. The exclusion criteria were as follows: inability to move independently or use measuring devices, such as an InBody device, and current diagnosis of the following: cardiovascular diseases (e.g., myocardial infarction, congestive heart failure, angina, and arrhythmia), cerebrovascular diseases (e.g., cerebral infarction and paralysis), malignant neoplasms (i.e., cancer), mental illnesses (e.g., depression and anxiety disorder), rheumatoid arthritis, and thyroid diseases (e.g., hyperthyroidism and hypothyroidism). This study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of each of the six hospitals assessed. Informed consent was obtained from all the participants involved in the study.
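The inclusion and exclusion criteria above can be expressed as a simple screening check. The sketch below is illustrative only: the `Candidate` record, its field names, and the diagnosis labels are hypothetical and do not reflect the structure of the study's eCRF.

```python
from dataclasses import dataclass, field

# Diagnosis groups listed as exclusion criteria in the text
# (the label strings themselves are hypothetical).
EXCLUDED_DIAGNOSES = {
    "cardiovascular disease",
    "cerebrovascular disease",
    "malignant neoplasm",
    "mental illness",
    "rheumatoid arthritis",
    "thyroid disease",
}


@dataclass
class Candidate:
    """Hypothetical screening record for one prospective participant."""
    age: int
    completed_national_screening: bool = True
    cognitive_impairment: bool = False
    can_move_independently: bool = True
    can_use_measuring_devices: bool = True
    current_diagnoses: set = field(default_factory=set)


def is_eligible(c: Candidate) -> bool:
    """Apply the stated inclusion and exclusion criteria."""
    # Inclusion: adult aged 19 or older, no cognitive impairment,
    # recruited from those who completed the national health screening.
    if c.age < 19 or c.cognitive_impairment:
        return False
    if not c.completed_national_screening:
        return False
    # Exclusion: unable to move independently or to use measuring devices.
    if not (c.can_move_independently and c.can_use_measuring_devices):
        return False
    # Exclusion: any current diagnosis from the listed disease groups.
    return not (c.current_diagnoses & EXCLUDED_DIAGNOSES)


print(is_eligible(Candidate(age=45)))
print(is_eligible(Candidate(age=52, current_diagnoses={"thyroid disease"})))
```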

Results

Preliminary Review
Initially, we examined the status of questionnaire development for KM diseases according to the Korean Standard Classification of Diseases. There were 32 questionnaires for pattern identification of diseases or symptoms, 9 for Sasang constitution diagnosis, 10 for health assessment, and 4 for treatment evaluation. Subsequently, we scrutinized the quantification tools for diagnostic indicators in KM textbooks. In the four core diagnostic examinations, 20 indicators for inspection, 7 indicators for listening and smelling, 6 indicators for inquiry, and 3 indicators for palpation were identified. Moreover, we examined eight research projects conducted by the KIOM, revealing the utilization of 12 types of equipment tests. These included pulse tonometry devices, tongue image analysis systems, abdominal examinations, heart rate variability measurements, and electroencephalography. Finally, among domestic commercial medical devices, 224 types of diagnostic or measuring devices were identified as potential candidates for use in KME. Through the review of these materials by the three KMDs, 175 potential core indicators for KME were categorized into 39 subcategories under 9 primary categories: daily living symptoms, health habits, subjective symptoms, health status, KM diagnosis, laboratory results, medical information, DNA analysis, and menstrual health.

The First, Second, and Third Expert Opinions
Through the first round of expert opinions on the suitability of KME core indicator candidates, the nine primary categories were condensed into four: daily living symptoms, health habits, menstrual health, and KM diagnosis. The experts' main opinion was that categories such as subjective symptoms, health status, and DNA analysis should be considered key indicators for each disease rather than KME core indicators. The 39 subcategories were reduced to 22; newly added subcategories included body shape/posture and facial color under daily living symptoms, and iris diagnosis and EAV under KM diagnosis. Finally, the total number of indicators was reduced from 175 to 98.
The indicators selected from the second round of expert opinions were classified into two categories: questionnaires and measurements. The questionnaire categories included thirst/xerostomia/amount of water consumed, digestion, stool, urine, sleep, sweating, heat and cold patterns, emotions, stress, eating habits, alcohol habits, smoking habits, exercise habits, and menstrual health status. The measurement category encompassed body shape/posture (3D body scan), facial color, halitosis, digestion (electrogastrogram, EGG), urine (chemical ingredient, color), stress (heart rate variability, HRV), heat and cold patterns (resting metabolic rate, RMR; ventilation rate, VR), height/weight, body composition, blood pressure, body heat (thermography), pulse diagnosis, tongue diagnosis, abdominal examination, and Sasang constitution. In this round, the total number of indicators was reduced from 98 to 86.
In the third round, we aimed to restructure the method of measuring indicators in an actual clinical setting using questionnaires and medical examinations. Some indicators that were difficult to implement in the field of KM were excluded. For example, although the state of gastrointestinal movement is an indicator, the electrogastrography suggested by experts to measure it could not be legally used in the field of KM. Through this process, the total number of indicators was reduced from 86 to 73. Furthermore, some indicators were included in both the questionnaire and the medical examination. For instance, stress was selected to be measured by both the Perceived Stress Scale in the questionnaire and HRV in the medical examination. Indicators related to heat and cold patterns were likewise assessed both in the questionnaire and by RMR and VR measured with a Quark RMR indirect calorimeter. Finally, all the information measured by the devices selected for the medical examination was included in the measurement items. For example, the chemical analysis of urine included not only the items related to urine color, such as urobilinogen and bilirubin, but also the additional items reported by the device, such as glucose, urine specific gravity, ketone body, occult blood, pH, and urine nitrite.
The KME program was divided into two categories: questionnaires and medical examinations. The questionnaire categories included thirst/xerostomia/amount of water consumed, digestion, stool, urine, sleep, sweating, heat and cold patterns, eating habits, menstrual health, alcohol habits, smoking habits, exercise habits, Sasang constitution questionnaire, Pittsburgh Sleep Quality Index (PSQI), Perceived Stress Scale (PSS), and Core Seven Emotions Inventory (CSEI). The medical examination category comprised onsite measurement equipment tests, including chemical analysis of urine, urine color, body temperature, skin moisture and lipids, tongue diagnosis, salivary flow rate, halitosis, pulse diagnosis, height/weight, body composition, HRV, RMR, VR, abdominal examination, blood pressure, Sasang constitution measurement, 3D body scan, digital thermography, and multiple allergen simultaneous test (MAST) (Table 1). The questionnaire category was systematically refined for readability and participant convenience, and SOPs were developed for each measurement method.

Pilot Study Using the First KME Protocol
We developed the first KME protocol, which specified the space requirements for KME, the device installation environment, SOPs for the questionnaires and device measurements, requirements for participant adherence, and requirements for examiners. A pilot study was conducted with 90 participants at 3 KM hospitals in 2019 [13]. Following the pilot study, health check-up data were added to the final protocol to complement the results and confirm the participants' health status more objectively. To increase the validity of the survey content, the indicator 'exercise habits' was changed to the International Physical Activity Questionnaire Short Form (IPAQ-SF), and the 100-item CSEI questionnaire was changed to the 28-item Core Seven Emotions Inventory-Short Form (CSEI-S) to improve participants' understanding and ease of response. Finally, active oxygen and posture analyses were added to the medical examinations. All experts who participated in the pilot study concurred with the revised KME protocol.

Final KME Protocol
The final KME protocol is presented in Table 2 [14]. The protocol consisted of the demographics of the participants, 16 questionnaires, 20 medical examinations, and the MAST.

The KME Protocol Updates
Table 3 shows the annual updates made to the KME protocol during the study. Starting in 2022, lifestyle-related diseases and self-rated health were added to the participants' demographic information to allow data analysis based on the participants' level of health. Lifestyle-related diseases included the diagnosis of hypertension, diabetes, and hyperlipidemia, along with the time of diagnosis and whether the disease is currently present [24]. Self-rated health was defined as the degree of health the participants considered themselves to be in on a 5-point Likert scale [25]. In the medical examination, the posture analysis, 3D body scan, and digital thermography were altered. For posture analysis, the direction of sight was added in 2021 and 2022; however, from 2023 onward, all posture analysis measurements were removed from the protocol. The 3D body scan was also removed in 2023. This decision was based on a preliminary analysis of the data collected up until 2022, which indicated that both the 3D body scan and posture analysis had low data utilization. For digital thermography, peripheral body temperature measurements of the palm, back of the hand, and sole were added starting in 2023. Since 2023, the MAST has tested 108 allergen types, with 15 types that people have recently encountered frequently added to the existing 93.
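Because the set of collected items changed from year to year, anyone analyzing the pooled dataset needs to know which items exist for a given collection year. The sketch below encodes only the changes described in this section as a lookup; the function name and dictionary keys are hypothetical and do not correspond to variable names in the actual dataset.

```python
def protocol_for_year(year: int) -> dict:
    """Return which updated protocol items applied in a given collection year.

    Encodes only the annual changes described in the text (2020 to 2023).
    """
    if year not in (2020, 2021, 2022, 2023):
        raise ValueError("data were collected under the final protocol from 2020 to 2023")
    return {
        # Demographic items added starting in 2022.
        "lifestyle_related_diseases": year >= 2022,
        "self_rated_health": year >= 2022,
        # Posture analysis: direction of sight added in 2021 and 2022,
        # all posture measurements removed from 2023 onward.
        "posture_analysis": year <= 2022,
        "posture_direction_of_sight": year in (2021, 2022),
        # 3D body scan removed in 2023.
        "body_scan_3d": year <= 2022,
        # Peripheral temperature sites (palm, back of hand, sole) added in 2023.
        "thermography_peripheral_sites": year >= 2023,
        # MAST panel: 93 allergen types, extended by 15 to 108 in 2023.
        "mast_allergen_types": 93 + 15 if year >= 2023 else 93,
    }


for y in (2020, 2021, 2022, 2023):
    print(y, protocol_for_year(y))
```

A lookup of this kind makes it explicit that, for example, posture variables are missing by design rather than by error in records from 2023.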

Status of Dataset Collection
An analysis of the demographic characteristics, physical features, and habits of 11,036 healthy adults, collected over four years, was conducted and segmented by year. The age distribution revealed that the largest group of participants was in their 40s. In terms of sex distribution, 70.9% were female and 29.1% were male. The participants exhibited an average height of 163 ± 8.28 cm and an average weight of 63.5 ± 12.3 kg. Among the participants, 80.6% were non-smokers, and 64.7% were alcohol consumers (Table 4).

Discussion
To standardize the field of Korean medicine (KM), this study aimed to develop a protocol for KME and systematically collect data according to established KME procedures.
Our study focused on a population of healthy adults, resulting in a comprehensive dataset comprising 11,036 individuals. This was an observational study conducted on a voluntary basis, accepting participants on a first-come, first-served basis without any sex distinction. However, the results indicated a higher participation rate among females. The initial target size for the participants was 12,000, but this was downsized due to COVID-19, ultimately comprising 11,036 individuals.
The process of standardization, aimed at achieving an optimal degree of order, led to the creation of a universally comprehensible common system [26]. High-quality quantitative clinical data characterized by adherence to stringent measurement standards were collected [27]. The process involved specifying the physical quantity under measurement using standardized tools and methods [28], ensuring consistency, and mitigating the risk of error. Collated data were then preserved in a standardized format, maintaining their integrity and original form. Establishing an ecosystem for the systematic collection of real-world data from KM clinical procedures is critical. This methodological approach not only ensures data reliability but also propels the scientific advancement of KM.
In the process of establishing the KME protocol, we encountered several challenges and took measures to address them. First, the need to minimize device malfunction was identified. Given the large number of participants measured, it is crucial to ensure that there are no measurement errors caused by the devices utilized. To prevent such errors, we used sufficiently validated equipment and deployed dedicated personnel at each site to immediately resolve any device errors. Additionally, regular staff training according to the SOPs enhanced the work capabilities of clinical coordinators [13]. Second, participant satisfaction in KME was of utmost importance. To augment this satisfaction, it is vital to provide participants with their results. Additionally, analyzing the examination results and explaining the KM diagnosis to participants were essential components of this process. These experiences provide valuable insights into the practical considerations and potential pitfalls in the implementation of a KME protocol and may serve as a guide for future research in this area.
The Tongue Diagnosis Data Center was established to conduct standardized research on diseases using data collected through tongue diagnosis, which is a component of KME programs. The center was designated the 63rd National Reference Standards Data Center in Korea by the Korean Agency for Technology and Standards, effective from 19 January 2023. In general, reference standards refer to measures officially validated through scientific analysis and evaluation of the accuracy and reliability of the measured data and information.
With the establishment of these data, the traditional subjective expression that a patient's pulse is weak can be placed in an objective context. This step enables the provision of an objective indicator stating that, for example, 'Your pulse corresponds to the lower 20% of the pulse strength standard of Korean women in their 40s'. This approach quantifies the traditional qualitative aspects of pulse diagnosis and enhances precision and reproducibility.
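A statement such as "the lower 20%" amounts to a percentile rank within a reference group. The following minimal sketch shows that calculation under simple assumptions: the reference values are made up, the function name is hypothetical, and the study does not specify how pulse strength is scored or how reference groups are stratified.

```python
from bisect import bisect_right


def percentile_rank(value, reference):
    """Percentage of reference values that are less than or equal to `value`."""
    if not reference:
        raise ValueError("reference group must not be empty")
    ordered = sorted(reference)
    # bisect_right counts how many reference values are <= value.
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical pulse-strength scores for one reference group
# (for example, healthy women in their 40s); not real data.
reference_scores = list(range(1, 101))

rank = percentile_rank(20, reference_scores)
print(f"This measurement falls in the lower {rank:.0f}% of the reference group.")
```

In practice, the reference list would be drawn from the healthy-adult dataset for the matching sex and age group, which is why a large, standardized reference population matters.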
The significance of this study lies in its pioneering approach to data collection in KM. It establishes a definitive set of information deemed essential for data collection during clinical practice, standardizes the data collection process, and enhances the reliability and applicability of the data for further research and clinical studies. Therefore, it is essential to create a comprehensive dataset for healthy individuals. This dataset serves as a standardized reference, enabling a comparative analysis of the current condition of patients. The ultimate aim is to use this extensive dataset to derive insights into patient conditions, thereby transforming the traditionally subjective patient status into an objective, quantifiable metric. Given the relative simplicity of data collection from a healthy population, our primary emphasis was on the rapid construction of a robust reference database for this demographic. At the same time, if we methodically gather disease-related data for our healthy participants, a comprehensive and scientifically rigorous approach to data collection and analysis will be ensured. This strategy not only bolsters the precision and reliability of our data but also significantly contributes to the progression of medical research and patient care.
However, this study has some limitations. The protocol, initially intended for a healthy population, has not been validated for use with specific disease populations, which limits its reliability as a standard. Amassing quantitative data from patients diagnosed with specific diseases is critical. However, without a comparative analysis involving a dataset from diseased individuals, the protocol's suitability for these populations is uncertain. Furthermore, although unstructured data such as images and sounds were collected in addition to structured data, a hyperconnected network environment creates a need to rapidly collect and integrate large volumes of both structured and unstructured data. This holistic approach to data collection ensures the creation of comprehensive datasets that encapsulate diverse forms of information, thereby enhancing the depth and breadth of the analysis [29]. Moreover, the items evaluated through questionnaires are not continuous variables, which may impose limitations on the analytical methods and outcomes. Additionally, the potential impact of lifestyle-related diseases such as hypertension, diabetes, and hyperlipidemia on the test indicators must be considered during data analysis; information on these diseases was incorporated only from midway through the project.

Conclusions
This study represents a significant step toward the standardization of KM, the development of a protocol for KME, and the systematic collection of data according to the KME standard operating procedures. In contemporary medical practice, the cornerstone of diagnosis lies in the comparison of a patient's diagnostic indicators with the average range established for a healthy population. However, in the realm of KM, the absence of physiological definitions for certain concepts has hindered the development of standard datasets for patient anomalies and average values for healthy individuals. This study is pioneering in its approach to standardizing the clinical diagnostic practices of KMDs, moving away from reliance on subjective senses.
Currently, the absence of a standardized diagnostic protocol means that medical data in KM are not being stored as digitized data. As the standardized KME protocol and the dataset become more prevalent, and as sufficient data on the normal values for healthy individuals and anomalies in patients accumulate, we will be poised to embrace the era of digital medicine. The prognostic information derived from the diagnosis and treatment in Korean medicine could be standardized and transformed into big data. This development will foster a favorable environment for the integration of artificial intelligence into KM. Despite its analog nature, which has previously been deemed a barrier, this approach paves the way for potential breakthroughs in the field of KM. Overall, this data collection approach in the KM field sets a new standard, paving the way for future research and clinical studies.

Informed Consent Statement: Informed consent was obtained from all the participants involved in this study.

Figure 1. Overview of the flow chart.

Table 1. Three rounds of expert opinions for traditional Korean medicine examination protocol.

Table 2. The final protocol including the core indicators of traditional Korean medicine examination.

Table 3. Annual updates of the traditional Korean medicine examination protocol.

Table 4. Characteristics of the traditional Korean medicine examination participants.